Claims 1, 2, 9, 10, 17, 18 and 19 were amended. Claims 6-8 and 14-16 were previously
withdrawn, as being drawn to a non-elected invention. The informality in claim 1, line
17 was removed.

Claims 1, 2, 9, 10, 17, 18 and 19 were amended to incorporate the organization of the
data structures and dictionaries embodied in the present invention, as shown in FIG. 1,
FIG. 3 and described in paragraphs [0021], [0022], [0023]-[0053] and [0062] of the
present application. According to these references, each subject area 300 comprises a set
of sub-subject areas 302, each sub-subject area (or subject area) comprises a set of
program modules 310, each program module 310 comprises a set of arguments 320 and
each argument 320 comprises a set of values 330. In the example of paragraphs [0026]-
[0053] the subject area 300 is "automobiles", the program modules 310 are "sales,
service and financing", the arguments 320 are "display price, submit offer, check status,
and obtain financing", and the argument values 330 are "purchaser id, vehicle id,
offer/price". This data organization is implemented in the dictionary, subdictionary
systems of FIG. 1 (i.e., subject area dictionary, program module dictionary, argument
dictionary, and value dictionary) and is mapped onto the parsed information of an
unrestricted free and continuous speech natural language utterance and then used to
sequentially identify parameters including subject area identifier, program module
identifier, argument identifier and value identifier. These parameters are used by the
computer system in order to produce computer instructions. This data structure and the
process of identifying the above mentioned parameters are unique to this invention and
are essential to the claimed method and apparatus for providing computer understanding
and instructions from unrestricted natural language.
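
The nesting of subject areas, program modules, arguments and values, and the sequential
identification of the four identifiers, can be illustrated with a minimal sketch. It is
not the implementation of the application: the names SUBJECT_AREAS, pick and identify,
the keyword sets and the value vocabularies are all assumptions made for illustration,
and simple word overlap stands in for the context-sensitive dictionary lookup.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical dictionaries: subject area -> program modules -> arguments -> values.
# Keyword sets and value vocabularies are invented for illustration only.
SUBJECT_AREAS: Dict[str, dict] = {
    "automobiles": {
        "keywords": {"car", "vehicle", "automobile"},
        "modules": {
            "sales": {
                "keywords": {"buy", "price", "offer"},
                "arguments": {
                    "display price": {
                        "keywords": {"price", "cost"},
                        "values": {"vehicle id": {"sedan", "truck"}},
                    },
                    "submit offer": {
                        "keywords": {"offer", "bid"},
                        "values": {"vehicle id": {"sedan", "truck"}},
                    },
                },
            },
            "financing": {
                "keywords": {"loan", "financing"},
                "arguments": {
                    "obtain financing": {
                        "keywords": {"loan", "financing"},
                        "values": {"vehicle id": {"sedan", "truck"}},
                    },
                },
            },
        },
    },
}


def pick(tokens: List[str], candidates: Dict[str, dict]) -> Optional[str]:
    """Return the first candidate whose keyword set shares a word with the tokens."""
    words = set(tokens)
    for name, entry in candidates.items():
        if words & entry["keywords"]:
            return name
    return None


def identify(utterance: str) -> Optional[Tuple[str, str, str, Dict[str, str]]]:
    """Identify subject area, module, argument and values, one level at a time."""
    tokens = utterance.lower().split()
    subject = pick(tokens, SUBJECT_AREAS)
    if subject is None:
        return None
    modules = SUBJECT_AREAS[subject]["modules"]      # only this area's modules
    module = pick(tokens, modules)
    if module is None:
        return None
    arguments = modules[module]["arguments"]         # only this module's arguments
    argument = pick(tokens, arguments)
    if argument is None:
        return None
    values = {
        name: word
        for name, vocabulary in arguments[argument]["values"].items()
        for word in tokens
        if word in vocabulary
    }
    return subject, module, argument, values


print(identify("what is the price of the red sedan car"))
```

Each step searches only the dictionary selected by the previous step, which is the
narrowing effect the four-level organization is meant to provide.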

Claims 1, 2, 9, 10, 17, 18 and 19 were also amended to clarify that the natural language
utterance is "a free continuous speech natural language" utterance, as described in the
related provisional application 60/274,786, segment P1, page 2, lines 14-15 (see
Appendix A).

The Examiner rejected independent claims 1, 9, 17, 18 and 19 under 35 USC 102(e) as
being anticipated by Block (6,073,102). The Examiner argued that Block teaches the
steps of determining a subject area identifier, determining a program module identifier,
determining an argument identifier and determining a value identifier in column 8, lines
15-25, column 8, lines 30-38 and column 10, lines 30-55, column 10, lines 41-55, and column
15, lines 36-40, respectively.

We respectfully disagree with the Examiner's interpretation of the cited text references in
Block. Referring to column 8, lines 5-23, lines 57-62 and FIG. 3, Block teaches
determining an action indicator in step 303 by employing a parser and defining an action
in step 304. The definition of an action is done by "allocating a prescribable plurality of
key concepts each of which respectively characterizes the action, to each action and in
comparing the action indicators determined from the action information that are defined
by the parameter parser PP to the key concepts. The comparison can be undertaken by a
direct word comparison or, on the other hand, by an arbitrary pattern comparison." In
other words, Block does not use a "context sensitive subject area dictionary system" as is
the case in the present invention as claimed in claims 1, 9, 17, 18 and 19. Actually, the
word "dictionary" is not mentioned anywhere in the entire Block patent. Furthermore,
the term "context sensitive" has a particular meaning in the field of linguistics and
computer science. According to Webster's New Millennium Dictionary, context-sensitive
is defined as "in linguistics or computer syntax, pertaining to an element whose
value depends on the context in which it appears" (see Appendix B). Block does not
make any reference to a dictionary or to a context sensitive dictionary. Furthermore, as
described in the related provisional application 60/274,786, segment P1/3, pages 2-4, the
"direct" method of word recognition by "dragging" each word along a dictionary is a slow
and inefficient process for developing computer understanding (see Appendix C). The
present invention overcomes the inefficiencies of the direct word comparison prior art
methods by structuring the data in subject areas, sub-subject areas, program modules,
arguments and values and by utilizing context-sensitive dictionaries for each subject area,
sub-subject area, program module, argument and value, respectively, for developing
understanding of a free and continuous speech natural language utterance by a computer.

The Examiner compared the "subject areas" of the present invention to the key concepts
of the Block patent, the "program modules" of the present invention to the actions, the
"arguments" to the parameters and the "values" to the parameter related information. We
respectfully disagree with this comparison.

Referring to column 8, lines 30-38, column 10, lines 39-55, column 15, lines 38-39,
Block teaches

"A first set of actions is identified wherein all action indicators coincide with at least a
portion of the key concepts".

"The individual actions can, for example, be characterized by the following parameters:
Rail information (point of departure, destination, date, time of day)"

"I would like to travel by train from Munich to Hamburg on Jan. 1, 1996 at 5:00."

In other words, Block uses actions (characterized by key concepts), parameters, and
parameter related information, i.e., three logical elements. The key concepts characterize
the actions and do not constitute a separate element (see column 8, lines 17-18 and
column 15, lines 13-15). On the contrary, the present invention utilizes four logical
elements, i.e., subject areas, program modules, arguments and values. Accordingly, it is
not logically possible to make a direct comparison between the three logical elements of
the Block patent and the four logical elements of the present invention. In order to
overcome this logical discrepancy the Examiner interpreted the key concepts as a fourth
element and compared it to the subject areas. We respectfully disagree with this
interpretation.

Furthermore, the present invention applies to "free and continuous speech natural
language utterances" whereas the Block patent refers to "a command" (column 2, lines
64-65) and "dialog arrangements" (column 1, lines 5-6). The problem with these prior art
language forms is that they rely on a limited number of keywords describing the key
concepts for developing computer understanding. For example, referring to Block,
column 15, lines 12-32, the key concepts for action {2}, i.e., air information, are
described with keywords "Fly, Flight information, Airplane". The problems with the
limited number of keywords describing the key concepts for the air information action
are:

a) They are limited. A speaker may use other words for air transportation, e.g., plane,
helicopter or jet, which are not included in the limited set of keywords describing the
key concepts.

b) Polysemy. Different meanings of similarly sounding words lead to confusing
statements.

c) They restrict the "free and continuous speech natural language utterances".
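
Limitation a) can be illustrated with a minimal sketch. It is a simplified stand-in for
a direct word comparison against a fixed keyword list and is not taken from Block: only
the three keywords are quoted from the cited passage, while the function name
matches_air_action and the substring test are assumptions made for illustration.

```python
# Keywords quoted above for the air information action; everything else is hypothetical.
AIR_KEYWORDS = {"fly", "flight information", "airplane"}


def matches_air_action(utterance: str) -> bool:
    """Direct word comparison: true only if a listed keyword occurs in the utterance."""
    text = utterance.lower()
    return any(keyword in text for keyword in AIR_KEYWORDS)


print(matches_air_action("I want to fly to Hamburg"))            # keyword present
print(matches_air_action("Is there a jet to Hamburg tonight"))   # synonym not listed
```

The second utterance concerns air travel but contains none of the listed keywords, so a
comparison restricted to that list does not associate it with the air information action.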

The present invention addresses these limitations of the prior art solutions by utilizing 
context-sensitive dictionaries and by structuring the data in subject areas, sub-subject 
areas, program modules, arguments and values for developing understanding of 
unrestricted "free and continuous speech natural language utterances" by a computer. 

In summary, the differences between the present invention as claimed in claims 1, 9, 17,
18 and 19 and the Block patent include:

• Use of context sensitive dictionaries 

• Organization of dictionary data in subject areas, sub-subject areas, program 
modules, arguments and values. 

• Direct comparison of the parsed natural language utterance with the organized 
context sensitive dictionaries in a stepwise process for identifying a subject area 
identifier, a module identifier, an argument identifier and a value identifier. 

• Production of computer instructions based on four logical elements, i.e., the
subject area identifier, the module identifier, the argument identifier and the value
identifier.

• Production of computer instructions from unrestricted "free and continuous
speech natural language utterances".

Based on these differences we conclude that claims 1, 9, 17, 18 and 19 are patentably
distinguishable from the Block patent, and they should be allowed. Claims 2-5 depend
upon claim 1 and claims 10-13 depend upon claim 9. Since claims 1 and 9 are
patentably distinguishable from Block, the dependent claims should also be patentably
distinguishable from Block and should also be allowed.

In view of the above, it is submitted that claims 1-5, 9-13, 17, 18 and 19 are in condition for
allowance. Reconsideration of the rejection is requested and allowance of these claims at
an early date is solicited.

If this response is found to be incomplete, or if a telephone conference would otherwise
be helpful, please call the undersigned at 781-235-4407.

Respectfully submitted,

Aliki K. Collins, Ph.D.
Reg. No. 43,558

AKC Patents, 215 Grove Street, Newton, MA 02466
TEL: 781-235-4407, FAX: 781-235-4409

Certificate of Mailing
I hereby certify under 37 CFR 1.10 that this correspondence is being deposited with the
United States Postal Service as "First Class Mail" in an envelope with sufficient postage
on the date indicated above and is addressed to the Commissioner for Patents, P.O. Box
1450, Alexandria, VA 22313-1450.

A METHOD OF COMPUTER UNDERSTANDING
USER'S NATURAL LANGUAGE INSTRUCTIONS 
IN A DIALOGUE PROCESS OF PROBLEM SOLVING 
IN AN ARBITRARY WIDE STATIONARY SUBJECT AREA 



Inventor: Professor Vitaliy S. Fain, Dr. of Sci., Ph.D.



CLAIM:

A method of computer understanding user's natural language instructions
in a dialogue process of problem solving in an arbitrary wide stationary
subject area,

said method differing in that:

- a computer is supplied beforehand with the set of all program modules
having participated in all programs having been used in the subject area
(the stationarity of it securing such a possibility);

- a whole stationary subject area is structurized by dividing it into several
subject subareas, which can be in their turn divided into sub-subareas and
so on, and each sub-...-subarea is provided with a "sub-...-subarea
identifier" SSI; some program modules are used in a single sub-...-subarea,
the others can participate in more than one sub-subareas;

- each of the modules is provided with a "module identifier" MI, and each
of the arguments of the module is provided with the "argument identifier"
AI and with the "measure of argument value identifier" VI;

- the Natural Language Instruction Understanding is determined as the
formation by the computer of correct, i.e. expected by the user, reaction to
his natural language address, which reduces "understanding" just to correct
recognizing the identifiers SSI, MI, AI, and VI in that address;

- structurizing the subject area leads to not too large numbers of
subareas in it, sub-subareas in subareas, modules in any sub-subarea,
arguments in any module, which makes the mentioned recognizing
problems reasonably easy.

FUNDAMENTALS.

There are computer systems of different destinations that could be controlled, fully or
partially, by orally pronounced natural language (NL) words and sentences. A subsystem of
recognizing and understanding users' oral addresses - a speech understanding engine - must be a
part of such a system.

This invention relates to the variety of such systems that is usually called dialogue systems.

We consider as dialogue such systems for computerized solving of practical (technical, 
medical, legal, managerial, commercial etc.) problems in which a dialogue is used between a user 
and a computer consisting of several exchanges of addresses. 

In this patent application, a "dialogue" is defined as an activity of two participants when one 
participant passes some information to the second participant, and the latter reacts to it somehow, 
and after that the sides swop their roles. 



In more detail, this invention deals with dialogues of the following kind:
- Information passed to a computer by a human can be in the form of a usual free
continuous oral speech in the user's natural language (NL). It can be referred to as
the human user's instruction or address (H-address, for short) to the computer.

Still in more detail, this invention deals with just those phases of a dialogue when
the human user gives NL instruction to the computer, and the computer reacts to that
instruction by solving the current subproblem pointed out or defined by that
instruction.

It is an essential property of any dialogue that a computer acts absolutely autonomously
between any two successive human user's NL instructions called "H-addresses". An episode of
autonomous work of a computer between the end of a previous H-address and the beginning of
the next H-address is called "Reaction and C-address forming", C-address meaning information
directed by the computer to the user.

Now, any computer can perform a work only if it has a program for it. If the previous
H-address does not contain the description of the computer program for preparing the
forthcoming reaction and C-address (which is usually the case with the dialogue systems in
general and is even inevitable if H-addresses are NL), then it means that the programs of all
[illegible] fixed program modules.

We call a program module "fixed" if the following items in it remain unchanged: 

a) a body of a program forming it, 

b) formal names or identifiers AI of arguments of a function implemented by that program 

Webster's New Millennium™ Dictionary of English

Main Entry: context-sensitive
Part of Speech: adjective

Definition: in linguistics or computer syntax, pertaining to an
element whose value depends on the context in which it
appears

Example: The program offers context sensitive help.

Webster's New Millennium™ Dictionary of English, Preview Edition (v 0.9.6)
Copyright © 2003-2005 Lexico Publishing Group, LLC
09/22/2006 08:47 PM
http://dictionary.reference.com/browse/context-sensitive



Provisional application for the Dependent Subpatent 1/3 (P1/3)



METHOD OF HIGHLY ACCURATE RECOGNITION OF WORDS
OF CONTINUOUS ORAL SPEECH IN A MAN-TO-COMPUTER DIALOGUE SYSTEM
USING RECEIVED WORDS' PROCESSING BY INDIVIDUAL ALGORITHMS FOR EACH
DICTIONARY WORD AND DECOMPOSITION OF LARGE SYSTEM DICTIONARY.

Inventor: Professor Vitaliy S. Fain, Dr. of Sci., Ph.D.



CLAIM:

A method of highly accurate recognition of words of continuous oral
speech in a man-to-computer dialogue system using received words'
processing by individual algorithms for each dictionary word and
decomposition of the large system dictionary

differing by that:

- The method is based on extending our highly accurate word
recognition Optimal Inverse Method applicable to small dictionaries
to the cases of large dictionaries;

- The extension is provided by decomposing the large dictionary of a
dialogue system into the structurized set of small subdictionaries, that
structure reflecting the structure of the subject areas and activities in them
the system deals with;

- An Optimal Inverse Method is proposed for the highly accurate
recognition of words of a small subdictionary, that method being based
on forming for each subdictionary word an individual word processing
algorithm that takes into account the individual properties of that
subdictionary word as fully as possible and is applied to a received word
prior to comparing it with that subdictionary word;

- A deterministic (not full search based) method is proposed of detecting
separate words in continuous speech thus making the Optimal Inverse
Method applicable to the continuous speech;

- Methods of the use of mentioned large dictionary decomposition are
proposed for overcoming the problems of homophony, homonymy and
synonymy in computer speech understanding.

The present state of the art

When offering any new method, it is necessary to show why the "old" ones are not good
enough. In order to do so let us consider the existing methods of computer word recognition.
They will be called "direct" below to stress their difference from the method offered here, called
"inverse".



The present invention deals with natural language dialog systems with large vocabularies
(a large vocabulary is one containing thousands to dozens of thousand words or more).
It is known that practically all modern word recognition engines intended for continuous
speech and large dictionaries are based on a Speech Units principle. The principle suggests seeing
in a continuous speech signal a sequence of "speech units", mainly speech sounds (phonemes),
diphones, syllables etc. It is an attractive idea because the number of classes in the primary pattern
recognition problem, which is the number of speech units to be recognized, is comparatively very
small, usually no more than several dozens. As a result of solving that primary problem, a word
initially represented by a continuous speech signal turns out to be described in terms of speech
units' names, that is in a symbolic form (SF). Prototypes of words in the engine dictionary
("dictionary words") also are represented in SF. The problem of final word recognition is solved
by comparing SF of a received word with SFs of dictionary words and looking for the best
likeness.

Fig. 1/3-1 represents a typical flow chart of an oral speech word recognition system based on the
direct method.

A speech signal representing a pronounced word comes from a microphone ( 1 ) and is
subjected to some preliminary processing, for example, segmentation and initial evaluation of
segments parameters ( 2 ). The segments then undergo the main processing by means of the
chosen algorithm ( 3 ) resulting in the sequence of speech units supposedly forming SF of a
received word ("received word SF" below). This SF is compared in ( 4 ) with SF of the prototype
of each dictionary word ("dictionary word SF" below). Presenting those dictionary word SFs for
comparison in due moments and order is controlled by the unit ( 5 ) summoning them one by one
from the system dictionary ( 6 ). It can be said in this case that the unit ( 5 ) performs screening of
the dictionary, and that each received word is "dragged" along the dictionary.
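
The comparison and screening steps ( 4 )-( 6 ) can be illustrated with a minimal sketch.
It is not the engine described above: the dictionary contents, the speech unit names and
the function recognize_direct are assumptions made for illustration, and the similarity
ratio of Python's difflib.SequenceMatcher stands in for whatever measure of "likeness"
a real engine would use.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical system dictionary: each word with the symbolic form (SF) of its prototype.
DICTIONARY: Dict[str, List[str]] = {
    "price": ["p", "r", "ay", "s"],
    "prize": ["p", "r", "ay", "z"],
    "offer": ["ao", "f", "er"],
}


def recognize_direct(received_sf: List[str]) -> Tuple[str, float]:
    """Screen the whole dictionary and return the word whose SF is most alike."""
    best_word, best_score = "", -1.0
    for word, dictionary_sf in DICTIONARY.items():   # "dragging" along the dictionary
        # The same unchanged received SF is compared with every dictionary word SF.
        score = SequenceMatcher(None, received_sf, dictionary_sf).ratio()
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


print(recognize_direct(["p", "r", "ay", "s"]))
```

Only the sequence of speech unit names takes part in the comparison, so any property of
the word as a whole plays no role in the result.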

Now it is possible to point out what is the problem with these "direct" methods.
The problem consists in the inherent property of the Speech Units principle to entail the loss of
an important for word recognition part of information contained in a received word.

The mentioned loss of information is connected with the dual nature of an oral word.
On the one hand, different sounds in a word really form its different and distinguishable parts.
They have very different physical nature, are formed in different parts of the voice tract by
different sound sources and by different laws of physics. This side of the word's nature provides
the understandable motivation for the Speech Parts principle and individual properties of each
sound ("local features" below) are usually used in both forming and recognizing speech units as
their distinctive features.

On the other hand, the continuity of a speech signal caused by the continuity and inertia of
voice tract organs' movements leads to the fact that the properties of a sound in a word strongly
depend not only on the sound itself but also on the adjacent and often even more remote sounds
of a word. This side of the word's nature suggests regarding a word as something whole and
therefore having the properties of a whole or integral properties. These integral properties can
form strong distinctive features of a word that can be used in word recognition ("integral
features" below). If, for example, a word contains more than one vowel, then the correlations
between energies of those vowels and between their lengths can become strong integral features
of the word. There can be lots of correlations of such kind in a word, and not only between just
speech units in it, but also between groups of speech units or parts of the word. It means that
integral features of this kind do not belong to any one specific speech unit in a word. Both the set
of those integral features and the selection of programs to compute them (this selection realizing
an "Integral Features Computing algorithm", IFC algorithm) are related to the word as a whole
and are unique for just that word.

Obviously, the best recognition performance of an engine can be achieved if both the local and
integral features of words are used. It can be seen, however, that the "direct" methods do not
present such a possibility.

The reason is simple. It is possible, of course, at least theoretically, to include into a dictionary
word SF both its local features and its integral features. However, it is impossible to include the
integral features into a received word SF and that leaves the latter with only local features.
Therefore, in the "direct" method, the integral features of a word cannot participate in the
comparison of SFs in ( 4, Fig. 1/3-1) which results in the inevitable decrease of the quality of the
recognition engine performance.

The reason of the impossibility of inclusion of integral features into a received word SF is as
simple. Any attempt to compare a received word with one of the dictionary words automatically
means that the hypothesis is made that the received word presents a version of the dictionary one;
therefore they must be described by the similar sets of parameters to be able to be compared, and
the structure of the parameters' set is determined, at least for integral features, by the IFC
algorithm of the dictionary word. However each dictionary word has its own unique IFC algorithm
and that means that prior to comparing with each dictionary word, the received word must be
processed by the IFC algorithm of just that dictionary word.

So in case of a system dictionary containing, for example, several dozens of thousand words,
recognition of a single received word would require applying to it several dozens of thousand
IFC algorithms pertaining to dictionary words. Even though it can be possible theoretically it is
hardly possible practically, and in any case the "direct" methods do not go for it. Instead, a single
word SF containing only local features is formed in ( 3, Fig. 1/3-1) for a received word and this
received word SF in the absolutely unchanged form participates in the comparisons with all
dictionary word SFs. All the integral features remain unused.

That is why the existing "direct" methods of word recognition are not good enough
indeed and why they must be considered non-optimal.

Unlike this, the "Inverse" method of word recognition described in this Invention does not
lose any information about a word: it uses both its local and integral features.

The essence of the "Inverse" method.

An oral speech address of a human user to a computer system during their dialogue will be
referred to as an H-address below, following the denotation introduced in the Draft of the
application for the Main Patent P1 ("Draft P1" below). In a dialogue, just an H-address is the
source of the above-mentioned "received" words.

In the "inverse" method, the dictionary entry of each dictionary word includes not only its
description in the form of values of its local and integral features but also references to program
modules realizing the IFC algorithm of that word ("IFC-program" below). When the moment
comes to compare a received word with the prototype of some dictionary word, not only local
features of the received word are calculated but it is subjected to the processing by the
IFC-program attached to that dictionary word, this processing creating the integral features of the
received word in the terms of that dictionary word. After that both local and integral sets of
features of the two words are compared and the degree of their resemblance calculated.
After that is accomplished, any one of the two possible modes of further operating can be
pursued:

a) The same received word from the H-address passes to the next dictionary word and
confronts with it. This is just the mode used in the "direct" methods and called "screening the
dictionary" above except that in this case the received word each time undergoes a new
processing by the IFC-program of each next dictionary word.

b) The same dictionary word passes to the next received word from the H-address and
confronts with it.
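
A dictionary entry that carries its own IFC-program, and the comparison of one received
word with it, can be illustrated with a minimal sketch. It is not the method of this
application: the representation of a received word as (speech unit, energy) pairs, the
single vowel energy ratio used as the integral feature, the 25 percent tolerance and all
names (Entry, vowel_energy_ratio, resemblance) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Segment = Tuple[str, float]   # hypothetical: (speech unit name, energy)


@dataclass
class Entry:
    local: List[str]                                   # SF of the prototype
    integral: Dict[str, float]                         # stored integral feature values
    ifc: Callable[[List[Segment]], Dict[str, float]]   # IFC-program of this word


def vowel_energy_ratio(first: str, second: str):
    """Build an IFC-program that measures the energy ratio of two given vowels."""
    def program(segments: List[Segment]) -> Dict[str, float]:
        energy = dict(segments)
        if first in energy and second in energy and energy[second] > 0:
            return {"ratio": energy[first] / energy[second]}
        return {}
    return program


def resemblance(received: List[Segment], entry: Entry) -> float:
    """Compare local features, then integral features computed by the entry's own IFC-program."""
    units = [unit for unit, _ in received]
    same = sum(a == b for a, b in zip(units, entry.local))
    local_score = same / max(len(units), len(entry.local))
    computed = entry.ifc(received)   # integral features in the terms of this dictionary word
    if not entry.integral:
        return local_score
    close = [
        name in computed and abs(computed[name] - target) <= 0.25 * abs(target)
        for name, target in entry.integral.items()
    ]
    return (local_score + sum(close) / len(close)) / 2


offer = Entry(["ao", "f", "er"], {"ratio": 2.0}, vowel_energy_ratio("ao", "er"))
print(resemblance([("ao", 4.0), ("f", 1.0), ("er", 2.0)], offer))
print(resemblance([("ao", 1.0), ("f", 1.0), ("er", 4.0)], offer))
```

Both received words have the same sequence of speech units, so only the integral feature
computed by the entry's own program separates them.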

Of course, both these modes preserve the integral features and the possibility to use them in
the word recognition. The word "Optimal" in the name of the proposed in this Invention word
recognition method (as opposed to the "non-optimality" of "direct" methods) reflects this fact. As
for the usefulness of each of the two modes, just the mode b) is suggested in this invention.

The reason of this choice is quite simple. Let the number of dictionary words be N, the number
of received words in an H-address be W.

In the mode a) the number of necessary reloadings of IFC-programs is obviously NW. At
the same time, it is as obvious that in the mode b) that number is only N.
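
The two counts can be checked with a minimal sketch that only tallies how often an
IFC-program has to be loaded under each loop order. The function names and the sample
sizes are assumptions made for illustration; no recognition is performed.

```python
def reloads_mode_a(n_dictionary_words: int, n_received_words: int) -> int:
    """Mode a): each received word is taken along the whole dictionary."""
    loads = 0
    for _received in range(n_received_words):
        for _entry in range(n_dictionary_words):
            loads += 1   # the IFC-program of this dictionary word is loaded again
    return loads


def reloads_mode_b(n_dictionary_words: int, n_received_words: int) -> int:
    """Mode b): each dictionary word is taken along all received words."""
    loads = 0
    for _entry in range(n_dictionary_words):
        loads += 1       # the IFC-program is loaded once for this dictionary word
        for _received in range(n_received_words):
            pass         # the already loaded IFC-program is applied to this received word
    return loads


N, W = 1000, 5
print(reloads_mode_a(N, W))   # N * W loadings
print(reloads_mode_b(N, W))   # N loadings
```

With N = 1000 dictionary words and W = 5 received words the tallies are 5000 and 1000,
in agreement with NW and N.
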
It can be seen that the way the received words confront with the dictionary words in the mode b)
is in a sense opposite to the one in the mode a). That is why the word recognition method
proposed in this Invention is called "Inverse". It seems also natural to call the process of confronting